Data Fetching Patterns

Learn about the different techniques used to fetch data from servers.

Introduction#

In real-world applications, servers can send large amounts of data to connected clients every day. HTTP is the predominant protocol for sending and fetching content on the Internet, and read-heavy applications such as Twitter and Quora serve millions of data fetching requests per second. Different applications need different communication structures depending on their use cases: some tolerate delays, while others need near real-time updates. For the latter, we need a mechanism that delivers real-time data updates between the client and server. A navigation system is one example: the client must repeatedly send its location, and the server responds with directions.

This lesson will discuss data polling techniques to send and fetch data to and from the server. The main polling methods are given below:

  • Short polling

  • Long polling

Additionally, this lesson discusses WebSocket, a full-duplex protocol that addresses the limitations of polling techniques and provides real-time communication between the client and server, so neither side has to poll for data. Let’s discuss these approaches one by one.

Short polling#

With short polling, the client requests a particular kind of data from the server at regular intervals (usually less than 1 minute). When the server gets the request, it must respond with either the updated data or an empty message. A small polling interval gives nearly real-time updates, but it also increases the number of requests, so choosing the polling frequency is crucial. Short polling is viable if we know the time frame in which the information will become available on the server. The illustration below represents the short polling procedure between the client and server:

[Figure: Short polling approach making needless requests to a particular server. At each interval, the client sends a GET request and the server answers with either an empty response or a response with content.]

Problems with the short polling approach are the following:

  • Needless requests when there are no frequent updates on the server-side. In this case, clients will mostly get empty responses as a result.

  • When the server has an update, it can’t send it to the client until the client’s next request arrives.

Let’s take a look at the following illustration to understand the problems mentioned above better:

[Figure: Delay problem in short polling. An update reaches the server right after an empty response, and the client receives it only with the response to its next GET request, one interval later.]

The illustration shows that the server has new data to send immediately after the first response. However, the client receives that data only after the predetermined interval elapses and its next request reaches the server.
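The needless requests and the delivery delay can be reproduced with a small in-process simulation. This is only a sketch: `FakeServer`, its `poll` method, and the time units are hypothetical stand-ins for a real HTTP endpoint and a real clock.

```python
from dataclasses import dataclass, field


@dataclass
class FakeServer:
    """Hypothetical stand-in for an HTTP endpoint that is polled with GET requests."""
    # Maps the time an update becomes available to its payload.
    updates: dict = field(default_factory=dict)

    def poll(self, now, last_seen):
        """Return updates published in (last_seen, now], or an empty list."""
        return [data for t, data in sorted(self.updates.items()) if last_seen < t <= now]


def short_poll(server, interval, duration):
    """Poll every `interval` time units and log (time, response) per request."""
    log = []
    last_seen = -1
    now = 0
    while now <= duration:
        response = server.poll(now, last_seen)  # one GET request per interval
        log.append((now, response))             # an empty list models an empty response
        last_seen = now
        now += interval
    return log


# The update becomes available at t=11, just after the request at t=10.
server = FakeServer(updates={11: "new data"})
log = short_poll(server, interval=10, duration=30)
empty = sum(1 for _, response in log if not response)
delivered_at = next(t for t, response in log if response)
print(f"requests={len(log)}, empty={empty}, delay={delivered_at - 11}")
# prints: requests=4, empty=3, delay=9
```

Three of the four requests return nothing, and the update that appeared at t=11 waits until the request at t=20 before it reaches the client.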

Long polling#

Long polling works like short polling, except that the server holds the client’s request open until it has new updates, so the client stays connected while it waits. This approach is also referred to as a hanging GET. In long polling, the server does not return an empty response; instead, there is an acceptable waiting time after which the request times out and the client has to request again. This strategy is appropriate when data is processed in real time, because updates then arrive often enough to keep the waiting intervals short. Otherwise, it does not deliver a significant performance gain, because the client may have to reconnect to the server multiple times after timeouts before it gets new data. The illustration below shows how the client faces idle time and places a new request if there is no server-side update within the waiting time.

[Figure: An idle issue in long polling. The client stays idle after each GET request; when the waiting time threshold is reached without an update, it re-requests, and the server responds once an update arrives.]

Problems that come with long polling include:

  • Delays while waiting for an update or a timeout.

  • The server has to manage the unresolved states of numerous connections and their timeout details.

  • The client needs to establish multiple concurrent connections to send new information, because each new piece of information requires another request. Otherwise, it has to wait for a response or a timeout on the already established connection.
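The hold-then-re-request cycle of long polling can be sketched in-process with a blocking queue. `LongPollServer`, `wait_for_update`, and the timings are hypothetical; a real deployment would hold an HTTP request open rather than a queue read.

```python
import queue
import threading


class LongPollServer:
    """Hypothetical in-process stand-in for a long-polling HTTP endpoint."""

    def __init__(self):
        self._updates = queue.Queue()

    def publish(self, data):
        self._updates.put(data)

    def wait_for_update(self, timeout):
        """Hold the request open until an update arrives or `timeout` seconds pass."""
        try:
            return self._updates.get(timeout=timeout)
        except queue.Empty:
            raise TimeoutError("waiting time threshold reached") from None


def long_poll(server, timeout, max_requests):
    """Re-issue the request after every timeout until an update is delivered."""
    for attempt in range(1, max_requests + 1):
        try:
            return attempt, server.wait_for_update(timeout)
        except TimeoutError:
            continue  # idle period ended without data, so request again
    return max_requests, None


server = LongPollServer()
# Publish an update after roughly 0.25 s, while the client is already waiting.
threading.Timer(0.25, server.publish, args=["new data"]).start()
attempts, data = long_poll(server, timeout=0.1, max_requests=10)
print(attempts, data)
```

With a 0.1 s waiting time, the first requests time out while the client sits idle, and a later request returns as soon as the update is published. The exact number of attempts depends on thread scheduling, so the sketch does not assume a fixed value.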

Note: The TCP connection between the client and server used by the approaches discussed above can be persistent or non-persistent. With a persistent connection, the TCP connection is opened once and reused for multiple requests. With a non-persistent connection, a new TCP connection is established for each request.
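The difference between the two connection types can be observed with Python’s standard library. The sketch below starts a throwaway local server and counts how many TCP connections it accepts; `CountingHandler`, `fetch`, and `connections_used` are names made up for this illustration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class CountingHandler(BaseHTTPRequestHandler):
    """Local test handler that counts how many TCP connections were accepted."""
    protocol_version = "HTTP/1.1"  # allows keep-alive (persistent) connections
    connections = 0
    lock = threading.Lock()

    def setup(self):
        # setup() runs once per accepted TCP connection, not once per request.
        super().setup()
        with CountingHandler.lock:
            CountingHandler.connections += 1

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example output quiet


def fetch(conn):
    conn.request("GET", "/")
    return conn.getresponse().read()


def connections_used(host, port, requests, persistent):
    """Send `requests` GETs and return how many TCP connections the server saw."""
    before = CountingHandler.connections
    if persistent:
        conn = http.client.HTTPConnection(host, port)
        for _ in range(requests):
            fetch(conn)  # every request reuses the same connection
        conn.close()
    else:
        for _ in range(requests):
            conn = http.client.HTTPConnection(host, port)
            fetch(conn)  # a new connection (and handshake) per request
            conn.close()
    return CountingHandler.connections - before


server = ThreadingHTTPServer(("127.0.0.1", 0), CountingHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
host, port = server.server_address
persistent_count = connections_used(host, port, requests=3, persistent=True)
non_persistent_count = connections_used(host, port, requests=3, persistent=False)
server.shutdown()
server.server_close()
print(persistent_count, non_persistent_count)
```

In this setup, the persistent client is expected to need one connection for its three requests, while the non-persistent client opens one connection per request.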

Q

(Select all that apply.) Identify the advantage(s) of using persistent TCP connections from the given options.

A) Reduced network traffic

Explanation: Multiple requests can be sent over a single TCP connection.

B) Increased connection management cost

Explanation: A persistent connection is established once, which reduces the number of round trips spent on connection setup, so it does not increase management costs.

C) Decreased latency

Explanation: A single persistent connection decreases latency by avoiding a three-way handshake for every request.

D) Increased memory consumption

Explanation: Memory consumption does not relate to TCP connections.

WebSocket#

WebSocket is a persistent, full-duplex communication protocol that runs over a single TCP connection. In real-time applications, clients sometimes need to send requests to the server frequently. For instance, if the client initiates a request and has received only part of the response, it can still send another request over the same TCP connection. This is achieved by requesting to upgrade the connection from HTTP to WebSocket. Because real-time applications also need low latency and efficient resource utilization, WebSocket is an ideal choice. Let’s understand the discussed example with the following illustration, where the client sends two requests over a single TCP connection:

[Figure: Communication between the client and the server using WebSocket]

In the illustration above, the client and server can exchange data whenever needed, which is important for many real-time applications. Because a WebSocket connection is stateful, the same open TCP connection is reused for every message. This approach is suitable for multimedia chat, multiplayer games, notification systems, and so on.

A WebSocket connection has to be upgraded from a typical HTTP connection. More detail on this approach is available in this course’s lesson on WebSocket.
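The upgrade itself is an ordinary HTTP request with a few extra headers, as defined in RFC 6455. The sketch below builds such a request and computes the value a server returns in its `Sec-WebSocket-Accept` header; the helper names, host, and path are hypothetical, and the sample key is the one used in the RFC’s own example.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for computing the handshake response.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def build_upgrade_request(host, path, key):
    """Return the HTTP/1.1 request a client sends to ask for a WebSocket upgrade."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {key}",
        "Sec-WebSocket-Version: 13",
        "",
        "",
    ]
    return "\r\n".join(lines)


def accept_key(key):
    """Compute the Sec-WebSocket-Accept value the server sends with its 101 response."""
    digest = hashlib.sha1((key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


# Sample key from RFC 6455; a real client generates a random 16-byte value.
sample_key = "dGhlIHNhbXBsZSBub25jZQ=="
print(build_upgrade_request("example.com", "/chat", sample_key))
print(accept_key(sample_key))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

Once the server replies with `101 Switching Protocols` and the matching accept value, both sides stop speaking HTTP and exchange WebSocket frames over the same TCP connection.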

Q

Consider creating a real-time messaging application for an organization’s communication between multiple international offices. Which approach will be appropriate for this case?

A) Short polling

Explanation: Latency in short polling can be high, which makes it unsuitable for real-time communication. Additionally, both parties might need a separate connection for communication.

B) Long polling

Explanation: Latency might be high when there’s a message from the server to be delivered but no client request is outstanding, so the message can’t be delivered until the next request arrives.

C) WebSockets (correct answer)

Explanation: This approach allows two-way communication, so any party (client or server) can send a message as soon as it’s available.

D) None of the above.

Comparison of data polling approaches#

Now, we’ll compare each approach considering a real-time application. The table below compares the data polling approaches based on the principal factors that help decide which approach is suitable in a given scenario.

| Data Polling Approaches | Low Latency | Efficient Bandwidth Usage | Full Duplex | Browser Compatibility |
| --- | --- | --- | --- | --- |
| Short polling | No (can lower it by using a 0 timeout, but at the expense of poor network use) | No (can be better by increasing the timeout, but at the expense of higher delay) | No | Yes |
| Long polling | Yes (better than short polling) | Yes (better than short polling) | No | Yes |
| WebSocket | Yes | Yes | Yes | Yes |
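The latency and bandwidth trade-off noted for short polling can be made concrete with some quick arithmetic. The sketch below assumes that updates arrive at uniformly random times and ignores network round-trip time; `short_polling_tradeoff` is a hypothetical helper, not part of any library.

```python
def short_polling_tradeoff(interval_s, session_s):
    """Return (requests sent, average delay in seconds) for one polling client.

    Assumes updates arrive at uniformly random times, so an update waits
    half an interval on average before the next request picks it up.
    """
    requests = session_s // interval_s
    average_delay = interval_s / 2
    return requests, average_delay


for interval in (1, 10, 60):
    requests, delay = short_polling_tradeoff(interval, session_s=3600)
    print(f"interval={interval}s requests/hour={requests} avg delay={delay}s")
# interval=1s requests/hour=3600 avg delay=0.5s
# interval=10s requests/hour=360 avg delay=5.0s
# interval=60s requests/hour=60 avg delay=30.0s
```

Shrinking the interval lowers the average delay but multiplies the number of requests, most of which return nothing when updates are rare.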

Summary#

Latency is one of the most critical aspects of any API design, and we want to minimize it for the clients. In this lesson, we discussed different ways for clients and servers to communicate (short polling, long polling, and WebSocket) with low delay and efficient network use. Depending on the use case, one might be preferred over the others. Short polling can add a substantial delay by introducing a wait time between two requests. Long polling tackles this problem by sending a request that can remain outstanding on the server for some time. WebSocket, in turn, tackles the remaining problems of long polling by letting either the client or the server talk to the other at any time in full-duplex mode.
